 tertiary structure


BeeRNA: tertiary structure-based RNA inverse folding using Artificial Bee Colony

Mlaweh, Mehyar, Cazenave, Tristan, Alaya, Ines

arXiv.org Artificial Intelligence

The Ribonucleic Acid (RNA) inverse folding problem, designing nucleotide sequences that fold into specific tertiary structures, is a fundamental computational biology problem with important applications in synthetic biology and bioengineering. The design of complex three-dimensional RNA architectures remains computationally demanding and mostly unresolved, as most existing approaches focus on secondary structures. In order to address tertiary RNA inverse folding, we present BeeRNA, a bio-inspired method that employs the Artificial Bee Colony (ABC) optimization algorithm. Our approach combines base-pair distance filtering with RMSD-based structural assessment using RhoFold for structure prediction, resulting in a two-stage fitness evaluation strategy. To guarantee biologically plausible sequences with balanced GC content, the algorithm takes thermodynamic constraints and adaptive mutation rates into consideration. In this work, we focus primarily on short and medium-length RNAs (< 100 nucleotides), a biologically significant regime that includes microRNAs (miRNAs), aptamers, and ribozymes, where BeeRNA achieves high structural fidelity with practical CPU runtimes. The lightweight, training-free implementation will be publicly released for reproducibility, offering a promising bio-inspired approach for RNA design in therapeutics and biotechnology.
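
The two-stage evaluation described in the abstract above can be illustrated with a minimal sketch. This is not the BeeRNA implementation: the function names, the `max_bp_dist` threshold, the penalty value, and the two predictor callbacks (`predict_ss`, `predict_rmsd`) are all assumptions made for illustration, and no RhoFold API is called. The sketch only shows the general idea of using a cheap base-pair distance check to decide whether a candidate deserves the expensive RMSD evaluation.

```python
def pair_set(structure):
    """Return the set of (i, j) base pairs encoded in a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced dot-bracket string")
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced dot-bracket string")
    return pairs


def base_pair_distance(a, b):
    """Number of base pairs present in exactly one of the two structures."""
    if len(a) != len(b):
        raise ValueError("structures must have equal length")
    return len(pair_set(a) ^ pair_set(b))


def gc_content(seq):
    """Fraction of G and C nucleotides in a sequence."""
    seq = seq.upper()
    return sum(c in "GC" for c in seq) / len(seq)


def two_stage_fitness(seq, target_ss, predict_ss, predict_rmsd,
                      max_bp_dist=4, penalty=1000.0):
    """Lower is better. All parameters here are hypothetical.

    Stage 1: cheap secondary-structure filter via base-pair distance.
    Stage 2: expensive tertiary RMSD, only for candidates that pass stage 1.
    """
    d = base_pair_distance(predict_ss(seq), target_ss)
    if d > max_bp_dist:
        # Rejected candidates stay rankable: closer ones get a smaller penalty.
        return penalty + d
    return predict_rmsd(seq)
```

In such a setup, `predict_ss` and `predict_rmsd` would wrap whatever folding and structure-prediction tools are available; a check such as `0.4 <= gc_content(seq) <= 0.6` (bounds chosen arbitrarily here) could be added as a further constraint.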


Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design

Tan, Cheng, Zhang, Yijie, Gao, Zhangyang, Cao, Hanqun, Li, Stan Z.

arXiv.org Artificial Intelligence

While artificial intelligence has made remarkable strides in revealing the relationship between biological macromolecules' primary sequence and tertiary structure, designing RNA sequences based on specified tertiary structures remains challenging. Though existing approaches in protein design have thoroughly explored structure-to-sequence dependencies in proteins, RNA design still confronts difficulties due to structural complexity and data scarcity. Adding to the problem, direct transplantation of protein design methodologies into RNA design fails to achieve satisfactory outcomes, even though proteins and RNA share similar structural components. In this study, we aim to systematically construct a data-driven RNA design pipeline. We crafted a large, well-curated benchmark dataset and designed a comprehensive structural modeling approach to represent the complex RNA tertiary structure. More importantly, we proposed a hierarchical data-efficient representation learning framework that learns structural representations through contrastive learning at both cluster-level and sample-level to fully leverage the limited data. By constraining data representations within a limited hyperspherical space, the intrinsic relationships between data points could be explicitly imposed. Moreover, we incorporated extracted secondary structures with base pairs as prior knowledge to facilitate the RNA design process. Extensive experiments demonstrate the effectiveness of our proposed method, providing a reliable baseline for future RNA design tasks. The source code and benchmark dataset will be released publicly.
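
To make the idea of contrastive learning on a hypersphere concrete, here is a minimal sketch using the generic InfoNCE loss on L2-normalized vectors. It is not the loss used in the paper above, whose exact cluster-level and sample-level objectives are not given in the abstract; the function names and the `temperature` value are assumptions for illustration.

```python
import math


def l2_normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (hypothetical helper).

    Similarities are cosine similarities, because every vector is
    normalized to unit length before the dot product is taken.
    """
    a = l2_normalize(anchor)
    sims = [dot(a, l2_normalize(positive))]
    sims += [dot(a, l2_normalize(n)) for n in negatives]
    logits = [s / temperature for s in sims]
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]
```

The loss is small when the anchor is closer to its positive than to any negative, and large otherwise. A cluster-level variant could, under the same assumptions, treat members of the same structural cluster as positives.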


Generating Tertiary Protein Structures via an Interpretative Variational Autoencoder

Guo, Xiaojie, Du, Yuanqi, Tadepalli, Sivani, Zhao, Liang, Shehu, Amarda

arXiv.org Machine Learning

Much scientific enquiry across disciplines is founded upon a mechanistic treatment of dynamic systems that ties form to function. A highly visible instance of this is in molecular biology, where an important goal is to determine functionally-relevant forms/structures that a protein molecule employs to interact with molecular partners in the living cell. This goal is typically pursued under the umbrella of stochastic optimization with algorithms that optimize a scoring function. Research repeatedly shows that current scoring functions, though steadily improving, correlate weakly with molecular activity. Inspired by recent momentum in generative deep learning, this paper proposes and evaluates an alternative approach to generating functionally-relevant three-dimensional structures of a protein. Though typically deep generative models struggle with highly-structured data, the work presented here circumvents this challenge via graph-generative models. A comprehensive evaluation of several deep architectures shows the promise of generative models in directly revealing the latent space for sampling novel tertiary structures, as well as in highlighting axes/factors that carry structural meaning and open the black box often associated with deep models. The work presented here is a first step towards interpretative, deep generative models becoming viable and informative complementary approaches to protein structure prediction.


AlphaFold makes its mark in predicting protein structures

#artificialintelligence

Players applaud, say words like Whoo, bang plastic knives on the table and enjoy the best weekends with artificial intelligence as the main act, thanks to AI unleashed in games. WIRED UK's science editor, Matt Reynolds, looked at DeepMind's impact on AI milestones: "It has outplayed Go champions, bested professional StarCraft players and turned its attention to chess and shogi." Let the games continue but the serious stuff must seriously shine. In brief, we can admire that unleashing AI for the purpose of scientific discovery has become especially alive and well thanks to research at DeepMind. Tech watchers commented this week on research papers showing the strengths of AI. "As AI matures as a field (and runs out of video games to conquer) probably more of its achievements will look like these: solid improvements in important research domains."


Machine Learning Can Solve Rubik's Cubes Now

#artificialintelligence

Deep-learning machines have figured out how to master games like chess or Mortal Kombat. Now, computer scientists at the University of California, Irvine have taken things to the third dimension by creating an algorithm that can figure out how to solve a Rubik's Cube, a surprisingly difficult challenge. "Our algorithm is able to solve 100 percent of randomly scrambled cubes while achieving a median solve length of 30 moves - less than or equal to solvers that employ human domain knowledge," say the scientists in the abstract to their paper, up on arXiv. The algorithm, called DeepCube, uses what's known as "autodidactic iteration," a form of machine learning developed by the authors of the paper. The big challenge of autodidactic iteration was to allow machines to find their own rewards in solving a puzzle, a goal they can reach.


Machine Learning Finally Tackles the Rubik's Cube

#artificialintelligence

Deep-learning machines have figured out how to master games like chess or Mortal Kombat. Now, computer scientists at the University of California, Irvine have taken things to the third dimension by creating an algorithm that can figure out how to solve a Rubik's Cube, a surprisingly difficult challenge. "Our algorithm is able to solve 100 percent of randomly scrambled cubes while achieving a median solve length of 30 moves -- less than or equal to solvers that employ human domain knowledge," say the scientists in the abstract to their paper, up on arXiv. The algorithm, called DeepCube, uses what's known as "autodidactic iteration," a form of machine learning developed by the authors of the paper. The big challenge of autodidactic iteration was to allow machines to find their own rewards in solving a puzzle, a goal they can reach.


Artificial Intelligence and Molecular Biology

Hunter, Lawrence

AI Magazine

Molecular biology is emerging as an important domain for artificial intelligence research. The advantages of biology for design and testing of AI systems include large amounts of available online data, significant (but incomplete) background knowledge, a wide variety of problems commensurate with AI technologies, clear standards of success, cooperative domain experts, non-military basic research support and perceived potential for practical (and profitable) applications. These considerations have motivated a growing group of researchers to pursue both basic and applied AI work in the domain. More than seventy-five researchers working on these problems gathered at Stanford for an AAAI-sponsored symposium on the topic. This article provides a description of much of the work presented at the meeting, and fills in the basic biology background necessary to place it in context.